Firsthand Opiates Abuse on Social Media: Monitoring Geospatial Patterns of Interest Through a Digital Cohort
In the last decade, drug overdose deaths reached staggering proportions in the
US. Besides the raw yearly death count, which is worrisome per se, an alarming
picture comes from the steep acceleration of the death rate, which increased by 21%
from 2015 to 2016. While traditional public health surveillance suffers from
its own biases and limitations, digital epidemiology offers a new lens to
extract signals from Web and Social Media that might be complementary to
official statistics. In this paper we present a computational approach to
identify a digital cohort that might provide an updated and complementary view
on the opioid crisis. We introduce an information retrieval algorithm suitable
for identifying relevant subspaces of discussion on social media, and use it to mine data
from users showing explicit interest in discussions about opioid consumption on
Reddit. Moreover, despite the pseudonymous nature of the user base, almost 1.5
million users were geolocated at the US state level, resembling the census
population distribution with a good agreement. A measure of prevalence of
interest in opiate consumption has been estimated at the state level, producing
a novel indicator with information that is not entirely encoded in the standard
surveillance. Finally, we provide a domain-specific vocabulary
containing informal lexicon and street nomenclature extracted from user-generated
content that can be used by researchers and practitioners to implement novel
digital public health surveillance methodologies for supporting policy makers
in fighting the opioid epidemic.
Comment: Proceedings of the 2019 World Wide Web Conference (WWW '19)
Detecting the community structure and activity patterns of temporal networks: a non-negative tensor factorization approach
The increasing availability of temporal network data is calling for more
research on extracting and characterizing mesoscopic structures in temporal
networks and on relating such structure to specific functions or properties of
the system. An outstanding challenge is the extension of the results achieved
for static networks to time-varying networks, where the topological structure
of the system and the temporal activity patterns of its components are
intertwined. Here we investigate the use of a latent factor decomposition
technique, non-negative tensor factorization, to extract the community-activity
structure of temporal networks. The method is intrinsically temporal and allows
us to simultaneously identify communities and track their activity over time.
We represent the time-varying adjacency matrix of a temporal network as a
three-way tensor and approximate this tensor as a sum of terms that can be
interpreted as communities of nodes with an associated activity time series. We
summarize known computational techniques for tensor decomposition and discuss
some quality metrics that can be used to tune the complexity of the factorized
representation. We subsequently apply tensor factorization to a temporal
network for which a ground truth is available for both the community structure
and the temporal activity patterns. The data we use describe the social
interactions of students in a school, the associations between students and
school classes, and the spatio-temporal trajectories of students over time. We
show that non-negative tensor factorization is capable of recovering the class
structure with high accuracy. In particular, the extracted tensor components
can be validated either as known school classes, or in terms of correlated
activity patterns, i.e., of spatial and temporal coincidences that are
determined by the known school activity schedule.
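The decomposition described in the abstract above can be illustrated with a minimal sketch. It assumes a plain CP (PARAFAC) model fitted with multiplicative updates in NumPy, which is one common way to compute a non-negative tensor factorization and not necessarily the algorithm used by the authors. The function names, the toy tensor, and the choice of rank are all hypothetical.

```python
import numpy as np

def khatri_rao(X, Y):
    """Column-wise Kronecker product: (I x R), (J x R) -> (I*J x R)."""
    r = X.shape[1]
    return (X[:, None, :] * Y[None, :, :]).reshape(-1, r)

def nonnegative_cp(T, rank, n_iter=500, eps=1e-9, seed=0):
    """Approximate T[i, j, t] by sum_r A[i, r] * B[j, r] * C[t, r]
    with non-negative factors, using multiplicative updates."""
    rng = np.random.default_rng(seed)
    I, J, K = T.shape
    A, B, C = (rng.random((n, rank)) for n in (I, J, K))
    # Mode-n unfoldings, laid out to match khatri_rao above.
    T0 = T.reshape(I, J * K)
    T1 = np.moveaxis(T, 1, 0).reshape(J, I * K)
    T2 = np.moveaxis(T, 2, 0).reshape(K, I * J)
    for _ in range(n_iter):
        A *= (T0 @ khatri_rao(B, C)) / (A @ ((B.T @ B) * (C.T @ C)) + eps)
        B *= (T1 @ khatri_rao(A, C)) / (B @ ((A.T @ A) * (C.T @ C)) + eps)
        C *= (T2 @ khatri_rao(A, B)) / (C @ ((A.T @ A) * (B.T @ B)) + eps)
    return A, B, C

# Toy temporal network (made-up data): 8 nodes, 10 time steps.
# Nodes 0-3 interact during steps 0-4, nodes 4-7 during steps 5-9.
# Self-contacts are kept so that the toy tensor has exactly two components.
T = np.zeros((8, 8, 10))
T[:4, :4, :5] = 1.0
T[4:, 4:, 5:] = 1.0

A, B, C = nonnegative_cp(T, rank=2)
reconstruction = np.einsum("ir,jr,tr->ijt", A, B, C)
rel_error = np.linalg.norm(T - reconstruction) / np.linalg.norm(T)
# Column r of A gives node memberships of component r,
# column r of C gives the activity time series of that component.
```

In this sketch, each component pairs a set of nodes (columns of `A` and `B`) with an activity profile over time (a column of `C`), which mirrors the "communities with an associated activity time series" interpretation given in the abstract.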
Activity clocks: spreading dynamics on temporal networks of human contact
Dynamical processes on time-varying complex networks are key to understanding
and modeling a broad variety of processes in socio-technical systems. Here we
focus on empirical temporal networks of human proximity and we aim at
understanding the factors that, in simulation, shape the arrival time
distribution of simple spreading processes. Abandoning the notion of wall-clock
time in favour of node-specific clocks based on activity exposes robust
statistical patterns in the arrival times across different social contexts.
Using randomization strategies and generative models constrained by data, we
show that these patterns can be understood in terms of heterogeneous
inter-event time distributions coupled with heterogeneous numbers of events per
edge. We also show, both empirically and by using a synthetic dataset, that
significant deviations from the above behavior can be caused by the presence of
edge classes with strong activity correlations.
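As a rough illustration of the difference between wall-clock time and a node-specific activity clock, the sketch below runs a deterministic susceptible-infected (SI) process over a time-ordered contact list. It assumes that a node's activity clock simply counts the contacts the node has taken part in; this definition, the function name, and the toy contact list are illustrative assumptions and may differ from the paper's exact construction.

```python
from collections import defaultdict

def si_arrival_times(contacts, seed):
    """Deterministic SI spreading over contacts given as (t, i, j) tuples.

    Returns {node: (wall_clock_arrival, activity_clock_arrival)}.
    The activity clock of a node is assumed to count the contacts the node
    took part in, up to and including the contact that infected it.
    Contacts sharing the same timestamp are processed in sorted order.
    """
    ticks = defaultdict(int)   # per-node activity clock
    arrival = {seed: (0, 0)}   # the seed is infected at the start
    for t, i, j in sorted(contacts):
        ticks[i] += 1
        ticks[j] += 1
        for src, dst in ((i, j), (j, i)):
            if src in arrival and dst not in arrival:
                arrival[dst] = (t, ticks[dst])
    return arrival

# Made-up contact sequence: node "d" is inactive for a long stretch.
contacts = [(1, "a", "b"), (2, "b", "c"), (3, "b", "c"),
            (10, "c", "d"), (11, "a", "d")]
arrival = si_arrival_times(contacts, seed="a")
```

In this toy case node "d" is reached late in wall-clock time only because it had no contacts earlier, while on its own activity clock it is infected at its first contact.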
Data on face-to-face contacts in an office building suggests a low-cost vaccination strategy based on community linkers
Empirical data on contacts between individuals in social contexts play an
important role in providing information for models describing human behavior
and how epidemics spread in populations. Here, we analyze data on face-to-face
contacts collected in an office building. The statistical properties of
contacts are similar to other social situations, but important differences are
observed in the contact network structure. In particular, the contact network
is strongly shaped by the organization of the offices in departments, which has
consequences for the design of accurate agent-based models of epidemic spread.
We consider the contact network as a potential substrate for infectious disease
spread and show that its sparsity tends to prevent outbreaks of rapidly
spreading epidemics. Moreover, we define three typical behaviors according to
the fraction of links each individual shares outside their own department:
residents, wanderers and linkers. Linkers act as bridges in the
network and have large betweenness centralities. Thus, a vaccination strategy
targeting linkers efficiently prevents large outbreaks. Because such behavior may
be identified a priori from the organization of the offices or from surveys, without
full knowledge of the time-resolved contact network, this result may help in
designing efficient, low-cost vaccination or social-distancing strategies.
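The quantity used in the abstract above to distinguish residents, wanderers and linkers, namely the fraction of links an individual shares outside their own department, can be computed as in the following sketch. The function name and the toy data are hypothetical, and no class thresholds are applied because the abstract does not state them.

```python
from collections import defaultdict

def outside_fraction(edges, department):
    """For each node, return the fraction of its links that connect it
    to a node in a different department."""
    inside = defaultdict(int)
    outside = defaultdict(int)
    for u, v in edges:
        same = department[u] == department[v]
        for node in (u, v):
            if same:
                inside[node] += 1
            else:
                outside[node] += 1
    nodes = set(inside) | set(outside)
    return {n: outside[n] / (inside[n] + outside[n]) for n in nodes}

# Made-up aggregated contact network with two departments, X and Y.
department = {"a": "X", "b": "X", "c": "X", "d": "Y", "e": "Y"}
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("c", "d"), ("c", "e"), ("d", "e")]
frac = outside_fraction(edges, department)
```

Ranking individuals by this fraction only needs department membership and an aggregated contact list, in line with the abstract's remark that the behavior could be spotted without the full time-resolved network.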
On the Dynamics of Human Proximity for Data Diffusion in Ad-Hoc Networks
We report on a data-driven investigation aimed at understanding the dynamics
of message spreading in a real-world dynamical network of human proximity. We
use data collected by means of a proximity-sensing network of wearable sensors
that we deployed at three different social gatherings, simultaneously involving
several hundred individuals. We simulate a message spreading process over the
recorded proximity network, focusing on both the topological and the temporal
properties. We show that by using an appropriate technique to deal with the
temporal heterogeneity of proximity events, a universal statistical pattern
emerges for the delivery times of messages, robust across all the data sets.
Our results are useful to set constraints for generic processes of data
dissemination, as well as to validate established models of human mobility and
proximity that are frequently used to simulate realistic behaviors.
Comment: A. Panisson et al., On the dynamics of human proximity for data
diffusion in ad-hoc networks, Ad Hoc Netw. (2011)
Collective Response to Media Coverage of the COVID-19 Pandemic on Reddit and Wikipedia: Mixed-Methods Analysis
Background: The exposure and consumption of information during epidemic outbreaks may alter people’s risk perception and trigger behavioral changes, which can ultimately affect the evolution of the disease. It is thus of utmost importance to map the dissemination of information by mainstream media outlets and the public response to this information. However, our understanding of this exposure-response dynamic during the COVID-19 pandemic is still limited.
Objective: The goal of this study is to characterize the media coverage and collective internet response to the COVID-19 pandemic in four countries: Italy, the United Kingdom, the United States, and Canada.
Methods: We collected a heterogeneous data set including 227,768 web-based news articles and 13,448 YouTube videos published by mainstream media outlets, 107,898 user posts and 3,829,309 comments on the social media platform Reddit, and 278,456,892 views of COVID-19–related Wikipedia pages. To analyze the relationship between media coverage, epidemic progression, and users’ collective web-based response, we considered a linear regression model that predicts the public response for each country given the amount of news exposure. We also applied topic modelling to the data set using nonnegative matrix factorization.
Results: Our results show that public attention, quantified as user activity on Reddit and active searches on Wikipedia pages, is mainly driven by media coverage; meanwhile, this activity declines rapidly while news exposure and COVID-19 incidence remain high. Furthermore, using an unsupervised, dynamic topic modeling approach, we show that while the levels of attention dedicated to different topics by media outlets and internet users are in good accordance, interesting deviations emerge in their temporal patterns.
Conclusions: Overall, our findings offer an additional key to interpret public perception and response to the current global health emergency and raise questions about the effects of attention saturation on people’s collective awareness and risk perception and thus on their tendencies toward behavioral change.
Peer Reviewed. Postprint (published version)
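The Methods paragraph above mentions a linear regression that predicts public response from news exposure. A minimal sketch of such a fit, assuming ordinary least squares with a single predictor and an intercept, is given below. The numbers are made-up toy values and not data from the study, and the study's actual model specification may include further terms.

```python
import numpy as np

# Toy values (assumed for illustration only): news items per day and
# a public-response measure generated from a known linear relation.
news = np.array([10.0, 20.0, 40.0, 80.0, 160.0])
response = 2.0 + 3.0 * news

# Design matrix with an intercept column, solved by least squares.
X = np.column_stack([np.ones_like(news), news])
coef, *_ = np.linalg.lstsq(X, response, rcond=None)
intercept, slope = coef
predicted = X @ coef
```

Fitting one such model per country, as the abstract describes, would amount to repeating this step on each country's own time series.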